- Title
- Disclosed: an efficient depth-first, top-down algorithm for mining disjunctive closed itemsets in high-dimensional data
- Creator
- Vimieiro, Renato; Moscato, Pablo
- Relation
- Information Sciences Vol. 280, p. 171-187
- Publisher Link
- http://dx.doi.org/10.1016/j.ins.2014.04.044
- Publisher
- Elsevier
- Resource Type
- journal article
- Date
- 2014
- Description
- We focus, in this paper, on the computational challenges of identifying disjunctive Boolean patterns in high-dimensional data. We conduct our analysis focusing particularly on microarray gene expression data, since this is one of the most stereotypical examples of high-dimensional data. We devised a novel algorithm that takes advantage of the scarcity of samples in microarray data sets, allowing us to efficiently find disjunctive closed patterns. Our algorithm, Disclosed, mines disjunctive closed itemsets by exploring the search space in a depth-first, top-down manner. We evaluated the performance of our algorithm on this task using real, publicly available microarray gene expression data sets. Our experiments revealed under which data set characteristics our method achieves good, average, or poor performance. We also compared the performance of our method with the state-of-the-art algorithms for finding disjunctive closed patterns and disjunctive minimal generators. We observed that our approach is two orders of magnitude more efficient, in terms of both time and memory.
- Subject
- disjunctive itemset; frequent itemset mining; closed itemset; rare itemset; formal concept analysis; microarray data
- Identifier
- http://hdl.handle.net/1959.13/1305933
- Identifier
- uon:21139
- Identifier
- ISSN:0020-0255
- Language
- eng
- Reviewed